Inkrementelle Koreferenzanalyse für das Deutsche (Incremental Coreference Resolution for German)
We present an incremental approach to coreference resolution for German texts. Drawing on a broad empirical study, we show that an incremental method is superior to a non-incremental one and that, in both settings, using several classifiers yields better results than using only one. In addition, we define a simple salience measure that yields results nearly as good as those of a sophisticated method based on machine learning. Preprocessing is carried out exclusively by real components; we do not fall back, as is so often done, on perfect data (e.g. a treebank instead of a parser). The empirical results are correspondingly low. The approach operates with hard linguistic filters, which keep the set of antecedent candidates small. The evaluation is based on the coreference annotations of the TüBa-D/Z.
Anaphora Resolution with Real Preprocessing
In this paper we focus on anaphora resolution for German, a highly inflected language which also allows for closed-form compounds (i.e. compounds without spaces). In particular, we describe a system that uses only real preprocessing components, e.g. a dependency parser and a two-level morphological analyser. We trace the performance drop occurring under these conditions back to underspecification and ambiguity at the morphological level. A demanding subtask of anaphora resolution is the resolution of so-called bridging anaphora, a special variant of nominal anaphora where the heads of the coreferent noun phrases do not match. We experiment with two different resources in order to find out how best to cope with this problem.
An incremental model for coreference resolution with restrictive antecedent accessibility
We introduce an incremental model for coreference resolution that competed in the CoNLL 2011 shared task (open regular). We decided to participate with our baseline model, since it worked well with two other datasets. The benefits of an incremental over a mention-pair architecture are: a drastic reduction of the number of candidate pairs, a means to overcome the problem of underspecified items in pairwise classification, and the natural integration of global constraints such as transitivity. We do not apply machine learning; instead, the system uses an empirically derived salience measure based on the dependency labels of the true mentions. Our experiments seem to indicate that such a system is already on par with machine learning approaches.
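The abstract does not say how the salience measure is computed. A minimal sketch of one plausible reading follows, assuming salience is the relative frequency of each dependency label among the gold ("true") mentions of a training set. The function name `label_salience` and the toy labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def label_salience(true_mention_labels):
    """Assumed definition: share of true mentions that carry each dependency label."""
    counts = Counter(true_mention_labels)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {label: n / total for label, n in counts.items()}


# Toy dependency labels of true mentions, invented for illustration only.
labels = ["subj", "subj", "subj", "obja", "obja", "pn"]
salience = label_salience(labels)
```

Under this reading, an antecedent candidate attached to its head by a frequent label such as `subj` would be preferred over one attached by a rarer label.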
An incremental entity-mention model for coreference resolution with restrictive antecedent accessibility
We introduce an incremental entity-mention model for coreference resolution. Our experiments show that it is superior to a non-incremental version in the same environment. The benefits of an incremental architecture are: a reduction of the number of candidate pairs, a means to overcome the problem of underspecified items in pairwise classification, and the natural integration of global constraints such as transitivity. Additionally, we have defined a simple salience measure that - coupled with the incremental model - proved to establish a challenging baseline which seems to be on par with the machine-learning-based systems of the SemEval 2010 shared task.
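The incremental idea can be illustrated with a small sketch. This is not the authors' system: the `Mention` fields, the agreement and head-match filter, and the `SALIENCE` values are assumptions chosen for the example. Each new mention is checked against whole entities rather than against every earlier mention, and the filter requires agreement with every member of an entity, which is one simple way to enforce transitivity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    head: str
    gender: str   # "m", "f" or "n"
    number: str   # "sg" or "pl"
    label: str    # dependency label linking the mention to its head
    pronoun: bool = False


# Invented salience values per dependency label (assumption, not from the paper).
SALIENCE = {"subj": 0.5, "obja": 0.3, "pn": 0.2}


def compatible(mention, entity):
    """Hard filter: the mention must be consistent with every entity member."""
    for other in entity:
        if (mention.gender, mention.number) != (other.gender, other.number):
            return False
        # Two non-pronominal mentions must share the same head in this toy filter.
        if not mention.pronoun and not other.pronoun and mention.head != other.head:
            return False
    return True


def resolve(mentions):
    """Process mentions left to right, attaching each to the best entity."""
    entities = []      # each entity is a list of coreferent mentions
    comparisons = 0    # entity-level checks actually performed
    for m in mentions:
        comparisons += len(entities)
        candidates = [e for e in entities if compatible(m, e)]
        if candidates:
            # Prefer the entity whose most recent mention is most salient.
            best = max(candidates, key=lambda e: SALIENCE.get(e[-1].label, 0.0))
            best.append(m)
        else:
            entities.append([m])
    return entities, comparisons


mentions = [
    Mention("Anna", "f", "sg", "subj"),
    Mention("Maria", "f", "sg", "obja"),
    Mention("sie", "f", "sg", "subj", pronoun=True),
    Mention("Hund", "m", "sg", "obja"),
    Mention("er", "m", "sg", "subj", pronoun=True),
]
entities, comparisons = resolve(mentions)
heads = [[m.head for m in e] for e in entities]
all_pairs = len(mentions) * (len(mentions) - 1) // 2
```

In this toy run the pronoun "sie" has two compatible entities and is attached to the one whose last mention is a subject, and the number of entity-level checks stays below the number of mention pairs that a pairwise model would classify.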
An incremental model for the coreference resolution task of BioNLP 2011
We introduce our incremental coreference resolution system for the BioNLP 2011 Shared Task on Protein/Gene Interaction. The benefits of an incremental architecture over a mention-pair model are: a reduction of the number of candidate pairs, a means to overcome the problem of underspecified items in pairwise classification, and the natural integration of global constraints such as transitivity. A filtering system takes into account specific features of different anaphora types. We do not apply machine learning; instead, the system classifies with an empirically derived salience measure based on the dependency labels of the true mentions. The OntoGene pipeline is used for preprocessing.
Improving protein coreference resolution by simple semantic classification
Background: Current research has shown that major difficulties in event extraction for the biomedical domain are traceable to coreference. Therefore, coreference resolution is believed to be useful for improving event extraction. To address coreference resolution in molecular biology literature, the Protein Coreference (COREF) task was arranged in the BioNLP Shared Task (BioNLP-ST, hereafter) 2011 as a supporting task. However, the shared task results indicated that transferring coreference resolution methods developed for other domains to the biological domain was not straightforward, due to domain differences in the coreference phenomena. Results: We analyzed the contribution of domain-specific information, including information that indicates the protein type, in a rule-based protein coreference resolution system. In particular, the domain-specific information is encoded into semantic classification modules whose output is used in different components of the coreference resolution. We compared our system with the top four systems in the BioNLP-ST 2011; surprisingly, we found that the minimal configuration outperformed the best system in the BioNLP-ST 2011. Analysis of the experimental results revealed that semantic classification using protein information increased the F-score by 2.3% on the test data and 4.0% on the development data. Conclusions: The use of domain-specific information in semantic classification is important for effective coreference resolution. Since it is difficult to transfer domain-specific information across different domains, we need to continue to seek methods to utilize such information in coreference resolution.
Geschichte des Lehrerinnen- und Lehrerberufs (History of the Teaching Profession)
This contribution addresses the history of the teaching profession in the context of processes of state takeover and the expansion of knowledge and science systems. Using three country examples (Germany, Switzerland and Austria), it traces these developments and concludes by working out some characteristic features and critical points.